The Automatic Creation of Lexical Entries for a Multilingual MT System
Abstract
In this paper, we describe a method of extracting information from an on-line resource for the construction of lexical entries for a multilingual, interlingual MT system (ULTRA). We have been able to automatically generate lexical entries for interlingual concepts corresponding to nouns, verbs, adjectives and adverbs. Although several features of these entries continue to be supplied manually, we have greatly decreased the time required to generate each entry and see this as a promising method for the creation of large-scale lexicons.
Similar Resources
Construction of a Chinese-English Verb Lexicon for Embedded Machine Translation in Cross-language Information Retrieval
This paper addresses the problem of automatic acquisition of lexical knowledge for rapid construction of MT engines in multilingual applications. We describe new techniques for large-scale construction of a Chinese-English verb lexicon and we evaluate the coverage and effectiveness of the resulting lexicon for a structured MT approach that is embedded in a cross-language information retrieval syste...
Two Principles and Six Techniques for Rapid MT Development
In this paper we describe a range of techniques used at NMSU CRL for accelerating the development of MT systems. These techniques enable semi-automatic development of a number of components of a multilingual MT system, thereby enabling rapid deployment of MT capabilities in a new language. First, we describe the core multi-engine, multilingual architecture that enables the different techniques ...
Construction of a Chinese-English Verb Lexicon for Machine Translation
This paper addresses the problem of automatic acquisition of lexical knowledge for rapid construction of MT engines in multilingual applications. We describe new techniques for large-scale construction of a Chinese-English verb lexicon and we evaluate the coverage and effectiveness of the resulting lexicon. Leveraging off an existing Chinese conceptual database called HowNet and a large, semantic...
The Habanera Lexical Knowledge Base Management System
Habanera is a multipurpose multilingual lexical knowledge base that is developed at CRL to be used as a central repository of multilingual lexical data. The knowledge base contains a set of dictionaries and relations between entries, both within a dictionary (e.g., synonymy) and between entries of different dictionaries (e.g., translation). The format of monolingual lexical entries is left re...
MultiVal - towards a multilingual valence lexicon
MultiVal is a valence lexicon derived from lexicons of computational HPSG grammars for Norwegian, Spanish and Ga (ISO 639-3, gaa), with altogether about 22,000 verb entries and on average more than 200 valence types defined for each language. These lexical resources are mapped onto a common set of discriminants with a common array of values, and stored in a relational database linked to a web d...